104 research outputs found

    Incorporating feature ranking and evolutionary methods for the classification of high-dimensional DNA microarray gene expression data

    Background: DNA microarray gene expression classification poses a challenging task to the machine learning domain. Typically, the dimensionality of gene expression data sets ranges from several thousand to over 10,000 genes. A potential solution to this issue is using feature selection to reduce the dimensionality. Aim: The aim of this paper is to investigate how we can use feature quality information to improve the precision of microarray gene expression classification tasks. Method: We propose two evolutionary machine learning models based on the eXtended Classifier System (XCS) and a typical feature selection methodology. The first one, which we call FS-XCS, uses feature selection for feature reduction purposes. The second model is GRD-XCS, which uses feature ranking to bias the rule discovery process of XCS. Results: The results indicate that the use of feature selection/ranking methods is essential for tackling high-dimensional classification tasks, such as microarray gene expression classification. However, the results also suggest that using feature ranking to bias the rule discovery process performs significantly better than using the feature reduction method. In other words, using feature quality information to develop a smarter learning procedure is more efficient than reducing the feature set. Conclusion: Our findings have shown that extracting feature quality information can assist the learning process and improve classification accuracy. On the other hand, relying exclusively on the feature quality information (e.g., using feature reduction) might decrease the classification performance. Therefore, we recommend a hybrid approach that uses feature quality information to direct the learning process by highlighting the more informative features, while not preventing the learning process from exploring other features.
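
The contrast drawn in the abstract above, between reducing the feature set and using a ranking to bias the search, can be illustrated with a minimal Python sketch. This is not the authors' FS-XCS or GRD-XCS; the function names, the `floor` parameter, and the example scores are all hypothetical and chosen only for illustration.

```python
import random

def top_k_features(scores, k):
    """Feature reduction: keep only the k highest-scoring feature indices."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def biased_feature_sample(scores, n, rng, floor=0.05):
    """Ranking-biased choice: draw n distinct features, favouring high scores.

    Every feature keeps at least `floor` weight (an assumed parameter), so
    low-ranked features can still be explored rather than being discarded.
    """
    weights = [max(s, floor) for s in scores]
    candidates = list(range(len(scores)))
    chosen = []
    for _ in range(min(n, len(candidates))):
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        idx = candidates.index(pick)
        candidates.pop(idx)   # sample without replacement
        weights.pop(idx)
        chosen.append(pick)
    return sorted(chosen)

scores = [0.9, 0.1, 0.7, 0.0, 0.3]   # hypothetical feature-quality scores
print(top_k_features(scores, 2))      # hard cut: only features 0 and 2 survive
print(biased_feature_sample(scores, 3, random.Random(0)))  # soft bias
```

With the hard cut, features outside the top k can never appear in a rule; with the weighted draw, they appear less often but remain reachable, which is the hybrid behaviour the abstract recommends.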

    A comparative study of evolutionary approaches to the bi-objective dynamic Travelling Thief Problem

    Dynamic evolutionary multi-objective optimization is a thriving research area. Recent contributions span the development of specialized algorithms and the construction of challenging benchmark problems. Here, we continue these research directions through the development and analysis of a new bi-objective problem, the dynamic Travelling Thief Problem (TTP), including three modes of dynamic change: city locations, item profit values, and item availability. The interconnected problem components embedded in the dynamic problem dictate that the effective tracking of good trade-off solutions that satisfy both objectives throughout dynamic events is non-trivial. Consequently, we examine the relative contribution to the non-dominated set from a variety of population seeding strategies, including exact solvers and greedy algorithms for the knapsack and tour components, and random techniques. We introduce this responsive seeding extension within an evolutionary algorithm framework. The efficacy of alternative seeding mechanisms is evaluated across a range of exemplary problem instances using ranking-based and quantitative statistical comparisons, which combine performance measurements taken throughout the optimization. Our detailed experiments show that the different dynamic TTP instances present varying difficulty to the seeding methods tested. We posit the dynamic TTP as a suitable benchmark capable of generating problem instances with different controllable characteristics aligning with many real-world problems.

    Reproducibility and Baseline Reporting for Dynamic Multi-objective Benchmark Problems

    Dynamic multi-objective optimization problems (DMOPs) are widely accepted to be more challenging than stationary problems due to the time-dependent nature of the objective functions and/or constraints. Evaluation of purpose-built algorithms for DMOPs is often performed on narrow selections of dynamic instances with differing change magnitude and frequency or a limited selection of problems. In this paper, we focus on the reproducibility of simulation experiments for parameters of DMOPs. Our framework is based on an extension of PlatEMO, allowing for the reproduction of results and performance measurements across a range of dynamic settings and problems. A baseline schema for dynamic algorithm evaluation is introduced, which provides a mechanism to interrogate performance and optimization behaviours of well-known evolutionary algorithms that were not designed specifically for DMOPs. Importantly, by determining the maximum capability of non-dynamic multi-objective evolutionary algorithms, we can establish the minimum capability required of purpose-built dynamic algorithms to be useful. The simplest modifications to manage dynamic changes introduce diversity. Allowing non-dynamic algorithms to incorporate mutated/random solutions after change events determines the improvement possible with minor algorithm modifications. Future expansion to include current dynamic algorithms will enable reproduction of their results and verification of their abilities and performance across DMOP benchmark space. Comment: Accepted for publication in "Proceedings of the Genetic and Evolutionary Computation Conference (GECCO) 2022".
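
The diversity-introducing modification mentioned in the abstract above (adding random solutions after a change event) might look roughly like the following Python sketch. It is not taken from the paper or from PlatEMO; the function name, the replacement `fraction`, and the uniform box bounds are assumptions made for illustration, and change detection is left out entirely.

```python
import random

def inject_random_solutions(population, fraction, lower, upper, rng):
    """Return a copy of the population with a random fraction re-sampled.

    `fraction`, `lower`, and `upper` are assumed parameters: the share of
    individuals to replace and the bounds of a uniform decision space.
    """
    n_replace = int(round(fraction * len(population)))
    dim = len(population[0])
    new_pop = [list(ind) for ind in population]  # leave the input untouched
    for idx in rng.sample(range(len(new_pop)), n_replace):
        new_pop[idx] = [rng.uniform(lower, upper) for _ in range(dim)]
    return new_pop

# Hypothetical usage: call once after the environment is known to have changed.
population = [[0.5, 0.5, 0.5] for _ in range(10)]
population = inject_random_solutions(population, 0.2, 0.0, 1.0, random.Random(0))
```

The surviving individuals keep whatever the algorithm had learned before the change, while the re-sampled ones restore diversity; the fraction controls that trade-off.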

    Hierarchy and Misalignments in Complex New Product Development Projects

    Developing complex new products requires firms to break down the product into subsystems and create an organizational structure which ideally mirrors the product architecture. However, empirical evidence on the mirroring hypothesis is mixed and misalignments occur in the product and the corresponding organizational architectures. Misalignments take two general forms: (1) a missing link between two teams responsible for two interacting subsystems results in an unmatched interface and (2) two teams interacting without a link between their respective subsystems cause an unmatched interaction. In a model of product design as a search on a rugged landscape, we model misalignments as design teams searching on a “perceived” rather than “real” landscape. As a consequence, type-I or type-II errors are likely, whereby the former causes the teams to reject superior designs and the latter to accept inferior designs. We study the performance deterioration by two measures: the magnitude and frequency of errors. We show that unmatched interactions cause a higher type-I error both in magnitude and frequency. Unmatched interactions and interfaces cause the same magnitude of type-II error but unmatched interfaces cause a higher frequency of type-II error. We further study how misalignments affect the convergence behavior of the search process, i.e., the time to converge and the quality of the final design. We find that misalignments affect, though not necessarily increase, the convergence time significantly but they are not a critical factor in the final design quality. We discuss the managerial implications of our results for new product development projects.
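
The two error types described in the abstract above can be made concrete with a small Python sketch. It assumes a team accepts a candidate design only when its perceived fitness improves, treats ties as non-improvements, and measures magnitude as the absolute real-fitness gap; these conventions and the function names are illustrative assumptions, not the paper's definitions.

```python
def classify_decision(real_current, real_candidate,
                      perceived_current, perceived_candidate):
    """Label one accept/reject decision made on the perceived landscape."""
    accepted = perceived_candidate > perceived_current
    truly_better = real_candidate > real_current
    if truly_better and not accepted:
        return "type-I"    # a superior design was rejected
    if accepted and not truly_better:
        return "type-II"   # a non-improving design was accepted
    return "correct"

def error_summary(decisions):
    """Return {error type: (frequency, mean magnitude)} for a list of
    (real_current, real_candidate, perceived_current, perceived_candidate)."""
    summary = {}
    for kind in ("type-I", "type-II"):
        gaps = [abs(rc - r0) for (r0, rc, p0, pc) in decisions
                if classify_decision(r0, rc, p0, pc) == kind]
        freq = len(gaps) / len(decisions) if decisions else 0.0
        mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
        summary[kind] = (freq, mean_gap)
    return summary

# Hypothetical decisions: one type-I error, one type-II error, two correct.
decisions = [
    (0.5, 0.75, 0.5, 0.25),
    (0.5, 0.25, 0.5, 0.75),
    (0.5, 0.75, 0.5, 0.75),
    (0.5, 0.25, 0.5, 0.25),
]
print(error_summary(decisions))
```

Separating frequency from magnitude in this way mirrors the two deterioration measures the abstract names, without claiming to reproduce how the paper computes them.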

    FireRisk: A Remote Sensing Dataset for Fire Risk Assessment with Benchmarks Using Supervised and Self-supervised Learning

    In recent decades, wildfires, as widespread and extremely destructive natural disasters, have caused tremendous property losses and fatalities, as well as extensive damage to forest ecosystems. Many fire risk assessment projects have been proposed to prevent wildfires, but GIS-based methods are inherently challenging to scale to different geographic areas due to variations in data collection and local conditions. Inspired by the abundance of publicly available remote sensing projects and the burgeoning development of deep learning in computer vision, our research focuses on assessing fire risk using remote sensing imagery. In this work, we propose a novel remote sensing dataset, FireRisk, consisting of 7 fire risk classes with a total of 91872 labelled images for fire risk assessment. This remote sensing dataset is labelled with the fire risk classes supplied by the Wildfire Hazard Potential (WHP) raster dataset, and remote sensing images are collected using the National Agriculture Imagery Program (NAIP), a high-resolution remote sensing imagery program. On FireRisk, we present benchmark performance for supervised and self-supervised representations, with Masked Autoencoders (MAE) pre-trained on ImageNet1k achieving the highest classification accuracy, 65.29%. This remote sensing dataset, FireRisk, provides a new direction for fire risk assessment, and we make it publicly available on https://github.com/CharmonyShen/FireRisk. Comment: 10 pages, 6 figures, 1 table, 1 equation

    Statistical Modelling for Simulating and Interpreting an Egg Packaging Process for Giveaway Mitigation

    Giveaway, the excess product packed into orders, is one of the contributors to revenue loss that pre-packaged food manufacturers care about most. In collaboration with an egg packaging company, this study aims to discover operation rules to mitigate the giveaway in egg orders. To that end, two variables have been raised as potential controllable factors of giveaway. One statistical model has been developed to better interpret the experimental results by understanding the underlying rules of the egg grading machine. The experiments have been accurately reproduced by a simulation using the estimated model parameters, which indicates the success of the model. Based on the experiments, we claim that the number of accepted downgrade grades has a significant influence on the final giveaway ratio. Limitations and further potential of the statistical model have also been discussed.

    Development of Learning Objectives to Guide Enhancement of Chronic Disease Prevention and Management Curricula in Undergraduate Medical Education

    Phenomenon: Chronic disease is a leading cause of death and disability in the United States. With an increase in the demand for healthcare and rising costs related to chronic care, physicians need to be better trained to address chronic disease at various stages of illness in a collaborative and cost-effective manner. Specific and measurable learning objectives are key to the design and evaluation of effective training, but there has been no consensus on chronic disease learning objectives appropriate to medical student education. Approach: Wagner’s Chronic Care Model (CCM) was selected as a theoretical framework to guide development of an enhanced chronic disease prevention and management (CDPM) curriculum. Findings of a literature review of CDPM competencies, objectives, and topical statements were mapped to each of the six domains of the CCM to understand the breadth of existing learning topics within each domain. At an in-person meeting, medical educators prepared a survey for the modified Delphi approach. Attendees identified 51 possible learning objectives from the literature review mapping, rephrased the CCM domains as competencies, and constructed possible CDPM learning objectives for each competency, with the goal of reaching multi-institutional consensus on a limited number of CDPM learning objectives that would be feasible for institutions to use to guide enhancement of medical student curricula related to CDPM. After the meeting, the group developed a survey which included 39 learning objectives. In the study phase of the modified Delphi approach, 32 physician CDPM experts and educators completed an online survey to prioritize the top 20 objectives. The next step occurred at a CDPM interest group in-person meeting with the goal of identifying the top 10 objectives. Findings: The CCM domains were reframed as the following competencies for medical student education: patient self-care management, decision support, clinical information systems, community resources, delivery systems and teams, and health system practice and improvement. Eleven CDPM learning objectives were identified within the six competencies that were most important in developing curriculum for medical students. Insights: These learning objectives cut across education on the prevention and management of individual chronic diseases and frame chronic disease care as requiring the health system science competencies identified in the CCM. They are intended to be used in combination with traditional disease-specific pathophysiology and treatment objectives. Additional efforts are needed to identify specific curricular strategies and assessment tools for each learning objective.